Credit card defaults are a common problem these days. The goal of this Capstone project is to carry out exploratory and predictive analysis of credit card holders in Taiwan in order to predict which clients will default in the next month. The dataset and the factors taken into consideration are described below.
This dataset contains information on default payments, demographic factors, credit data, history of payment, and bill statements of credit card clients in Taiwan from April 2005 to September 2005.
There are 25 features:
ID: ID of each client
LIMIT_BAL: Amount of given credit in NT dollars (includes individual and family/supplementary credit)
SEX: Gender (1=male, 2=female)
EDUCATION: (1=graduate school, 2=university, 3=high school, 4=others, 5=unknown, 6=unknown)
MARRIAGE: Marital status (1=married, 2=single, 3=others)
AGE: Age in years
PAY_0: Repayment status in September, 2005 (-1=pay duly, 1=payment delay for one month, 2=payment delay for two months, ... 8=payment delay for eight months, 9=payment delay for nine months and above)
PAY_2: Repayment status in August, 2005 (scale same as above)
PAY_3: Repayment status in July, 2005 (scale same as above)
PAY_4: Repayment status in June, 2005 (scale same as above)
PAY_5: Repayment status in May, 2005 (scale same as above)
PAY_6: Repayment status in April, 2005 (scale same as above)
BILL_AMT1: Amount of bill statement in September, 2005 (NT dollar)
BILL_AMT2: Amount of bill statement in August, 2005 (NT dollar)
BILL_AMT3: Amount of bill statement in July, 2005 (NT dollar)
BILL_AMT4: Amount of bill statement in June, 2005 (NT dollar)
BILL_AMT5: Amount of bill statement in May, 2005 (NT dollar)
BILL_AMT6: Amount of bill statement in April, 2005 (NT dollar)
PAY_AMT1: Amount of previous payment in September, 2005 (NT dollar)
PAY_AMT2: Amount of previous payment in August, 2005 (NT dollar)
PAY_AMT3: Amount of previous payment in July, 2005 (NT dollar)
PAY_AMT4: Amount of previous payment in June, 2005 (NT dollar)
PAY_AMT5: Amount of previous payment in May, 2005 (NT dollar)
PAY_AMT6: Amount of previous payment in April, 2005 (NT dollar)
default.payment.next.month: Default payment (1=yes, 0=no)
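The coded categorical columns listed above can be made readable with lookup dictionaries. The following is a minimal sketch on a hypothetical three-row frame; the `decode_categories` helper and the `*_LABELS` names are assumptions introduced for illustration and are not part of the original notebook.

```python
import pandas as pd

# Code-to-label mappings copied from the feature descriptions above.
SEX_LABELS = {1: "male", 2: "female"}
EDUCATION_LABELS = {
    1: "graduate school", 2: "university", 3: "high school",
    4: "others", 5: "unknown", 6: "unknown",
}
MARRIAGE_LABELS = {1: "married", 2: "single", 3: "others"}


def decode_categories(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with *_LABEL columns added; unlisted codes become NaN."""
    out = df.copy()
    out["SEX_LABEL"] = out["SEX"].map(SEX_LABELS)
    out["EDUCATION_LABEL"] = out["EDUCATION"].map(EDUCATION_LABELS)
    out["MARRIAGE_LABEL"] = out["MARRIAGE"].map(MARRIAGE_LABELS)
    return out


# Hypothetical sample rows, not taken from the real file.
sample = pd.DataFrame({"SEX": [1, 2, 2], "EDUCATION": [1, 6, 0], "MARRIAGE": [1, 3, 0]})
decoded = decode_categories(sample)
print(decoded)
```

Codes that the descriptions do not list (such as 0 in the sample) are left as NaN rather than silently assigned to a category.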
Let's have a look at the dataset.
import numpy as np # linear algebra
import os # accessing directory structure
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import seaborn as sns
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')
creditdata = pd.read_csv("C:/Users/shrey/Desktop/Credit_card_project/Capstone/DA_Credit_Card.csv")
creditdata
| | ID | LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | ... | Is Average greater than 10k and less than 30k | Is Average greater than 30k and less than 50k | Is Average greater than 50k and less than 70k | Is Average greater than 70k and less than 100k | DUE_1 | DUE_2 | DUE_3 | DUE_4 | DUE_5 | DUE_6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 20000.0 | 2 | 2 | 1 | 24 | 2 | 2 | -1 | -1 | ... | 0 | 0 | 0 | 0 | 3913 | 2413 | 689 | 0 | 0 | 0 |
| 1 | 2 | 120000.0 | 2 | 2 | 2 | 26 | -1 | 2 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 2682 | 725 | 1682 | 2272 | 3455 | 1261 |
| 2 | 3 | 90000.0 | 2 | 2 | 2 | 34 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 27721 | 12527 | 12559 | 13331 | 13948 | 10549 |
| 3 | 4 | 50000.0 | 2 | 2 | 1 | 37 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 44990 | 46214 | 48091 | 27214 | 27890 | 28547 |
| 4 | 5 | 50000.0 | 1 | 2 | 1 | 57 | -1 | 0 | -1 | 0 | ... | 0 | 0 | 0 | 0 | 6617 | -31011 | 25835 | 11940 | 18457 | 18452 |
| 5 | 6 | 50000.0 | 1 | 1 | 2 | 37 | 0 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 0 | 61900 | 55254 | 56951 | 18394 | 18619 | 19224 |
| 6 | 7 | 500000.0 | 1 | 1 | 2 | 29 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 312965 | 372023 | 407007 | 522414 | 469253 | 460174 |
| 7 | 8 | 100000.0 | 2 | 2 | 2 | 23 | 0 | -1 | -1 | 0 | ... | 0 | 0 | 0 | 0 | 11496 | -221 | 601 | -360 | -1846 | -975 |
| 8 | 9 | 140000.0 | 2 | 3 | 1 | 28 | 0 | 0 | 2 | 0 | ... | 0 | 0 | 0 | 0 | 7956 | 14096 | 11676 | 11211 | 10793 | 2719 |
| 9 | 10 | 20000.0 | 1 | 3 | 2 | 35 | -2 | -2 | -2 | -2 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | -13007 | 11885 | 13912 |
| 10 | 11 | 200000.0 | 2 | 3 | 2 | 34 | 0 | 0 | 2 | 0 | ... | 0 | 0 | 0 | 0 | 8767 | 9775 | 5485 | 2213 | -1910 | 3665 |
| 11 | 12 | 260000.0 | 2 | 1 | 2 | 51 | -1 | -1 | -1 | -1 | ... | 0 | 0 | 0 | 0 | -9557 | 11704 | 1383 | -13784 | 22287 | 10028 |
| 12 | 13 | 630000.0 | 2 | 2 | 2 | 41 | -1 | 0 | -1 | -1 | ... | 0 | 0 | 0 | 0 | 11137 | 0 | 0 | 0 | 3630 | 2870 |
| 13 | 14 | 70000.0 | 1 | 2 | 2 | 30 | 1 | 2 | 2 | 0 | ... | 0 | 0 | 0 | 0 | 62602 | 67369 | 62701 | 63782 | 34637 | 36894 |
| 14 | 15 | 250000.0 | 1 | 1 | 2 | 29 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 67887 | 64060 | 60561 | 56696 | 53875 | 52512 |
| 15 | 16 | 50000.0 | 2 | 3 | 3 | 23 | 1 | 2 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 50614 | 27673 | 27016 | 27571 | 28231 | 29111 |
| 16 | 17 | 20000.0 | 1 | 1 | 2 | 24 | 0 | 0 | 2 | 2 | ... | 0 | 0 | 0 | 0 | 12176 | 18010 | 15928 | 18338 | 16255 | 19104 |
| 17 | 18 | 320000.0 | 1 | 1 | 1 | 49 | 0 | 0 | 0 | -1 | ... | 1 | 0 | 0 | 0 | 242928 | 236536 | 118723 | 50074 | -189743 | 145599 |
| 18 | 19 | 360000.0 | 2 | 1 | 1 | 49 | 1 | -2 | -2 | -2 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 19 | 20 | 180000.0 | 2 | 1 | 2 | 29 | 1 | -2 | -2 | -2 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 20 | 21 | 130000.0 | 2 | 3 | 2 | 39 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 35358 | 26151 | 23489 | 18616 | 10872 | -32834 |
| 21 | 22 | 120000.0 | 2 | 2 | 1 | 39 | -1 | -1 | -1 | -1 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 316 | -632 | 316 | 316 |
| 22 | 23 | 70000.0 | 2 | 2 | 2 | 26 | 2 | 0 | 0 | 2 | ... | 0 | 0 | 0 | 0 | 39080 | 38863 | 45020 | 40405 | 46905 | 44192 |
| 23 | 24 | 450000.0 | 2 | 1 | 1 | 40 | -2 | -2 | -2 | -2 | ... | 0 | 0 | 0 | 0 | -13916 | 17947 | 913 | 560 | 0 | -1128 |
| 24 | 25 | 90000.0 | 1 | 1 | 2 | 23 | 0 | 0 | 0 | -1 | ... | 0 | 0 | 0 | 0 | -1013 | 7070 | -5398 | 4198 | 4315 | 6292 |
| 25 | 26 | 50000.0 | 1 | 3 | 2 | 23 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 45647 | 40384 | 35022 | 27535 | 28767 | 29049 |
| 26 | 27 | 60000.0 | 1 | 1 | 2 | 27 | 1 | -2 | -1 | -1 | ... | 0 | 0 | 0 | 0 | -109 | -1425 | 259 | -557 | 127 | -1189 |
| 27 | 28 | 50000.0 | 2 | 3 | 2 | 30 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 21241 | 14838 | 16163 | 16378 | 17931 | 18605 |
| 28 | 29 | 50000.0 | 2 | 3 | 1 | 47 | -1 | -1 | -1 | -1 | ... | 0 | 0 | 0 | 0 | -2765 | -6 | 1372 | -28390 | 30173 | 257 |
| 29 | 30 | 50000.0 | 1 | 1 | 2 | 26 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 13829 | 15075 | 16496 | 16907 | 16775 | 11400 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 29970 | 29971 | 360000.0 | 1 | 1 | 1 | 34 | -1 | -1 | -1 | 0 | ... | 0 | 0 | 0 | 0 | -19297 | -11849 | 55162 | 48952 | -10908 | 3407 |
| 29971 | 29972 | 80000.0 | 1 | 3 | 1 | 36 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 63159 | 64358 | 65749 | 67118 | 67370 | 70612 |
| 29972 | 29973 | 190000.0 | 1 | 1 | 1 | 37 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 19628 | 17024 | -19259 | 19108 | -128866 | 143682 |
| 29973 | 29974 | 230000.0 | 1 | 2 | 1 | 35 | 1 | -2 | -2 | -2 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 29974 | 29975 | 50000.0 | 1 | 2 | 1 | 37 | 1 | 2 | 2 | 2 | ... | 0 | 0 | 0 | 0 | 10904 | 6316 | 4328 | 2846 | 585 | 324 |
| 29975 | 29976 | 220000.0 | 1 | 2 | 1 | 41 | 0 | 0 | -1 | -1 | ... | 0 | 0 | 0 | 0 | 36235 | 2197 | -4555 | 4165 | -65 | -5198 |
| 29976 | 29977 | 40000.0 | 1 | 2 | 2 | 47 | 2 | 2 | 3 | 2 | ... | 0 | 0 | 0 | 0 | 48358 | 54892 | 51415 | 51259 | 43631 | 46934 |
| 29977 | 29978 | 420000.0 | 1 | 1 | 2 | 34 | 0 | 0 | 0 | 0 | ... | 1 | 0 | 0 | 0 | 124939 | 129721 | 134511 | 136195 | 139239 | 142954 |
| 29978 | 29979 | 310000.0 | 1 | 2 | 1 | 39 | 0 | 0 | 0 | 0 | ... | 1 | 0 | 0 | 0 | 228944 | 227978 | 223825 | 211360 | 208500 | 200616 |
| 29979 | 29980 | 180000.0 | 1 | 1 | 1 | 32 | -2 | -2 | -2 | -2 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 29980 | 29981 | 50000.0 | 1 | 3 | 2 | 42 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 33998 | 45123 | 44397 | 47360 | 15471 | 17694 |
| 29981 | 29982 | 50000.0 | 1 | 2 | 1 | 44 | 1 | 2 | 2 | 2 | ... | 0 | 0 | 0 | 0 | 36371 | 35072 | 33101 | 27675 | 22173 | 14062 |
| 29982 | 29983 | 90000.0 | 1 | 2 | 1 | 36 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 6252 | 7612 | 8806 | 10128 | 9536 | 14329 |
| 29983 | 29984 | 20000.0 | 1 | 2 | 1 | 44 | -2 | -2 | -2 | -2 | ... | 0 | 0 | 0 | 0 | -1068 | 152 | -178 | -6381 | 7411 | 18 |
| 29984 | 29985 | 30000.0 | 1 | 2 | 2 | 38 | -1 | -1 | -2 | -1 | ... | 0 | 0 | 0 | 0 | -608 | -2054 | 940 | -1064 | -1412 | 2319 |
| 29985 | 29986 | 240000.0 | 1 | 1 | 2 | 30 | -2 | -2 | -2 | -2 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 29986 | 29987 | 360000.0 | 1 | 1 | 2 | 35 | -1 | -1 | -2 | -2 | ... | 0 | 0 | 0 | 0 | 2220 | 0 | 0 | 0 | 0 | 0 |
| 29987 | 29988 | 130000.0 | 1 | 1 | 2 | 34 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 20292 | 12077 | -77454 | 104047 | 88681 | 93348 |
| 29988 | 29989 | 250000.0 | 1 | 1 | 1 | 34 | 0 | 0 | 0 | 0 | ... | 1 | 0 | 0 | 0 | 214640 | 244113 | 234064 | 239750 | 168005 | 173678 |
| 29989 | 29990 | 150000.0 | 1 | 1 | 2 | 35 | -1 | -1 | -1 | -1 | ... | 0 | 0 | 0 | 0 | -5629 | 9009 | -786 | 780 | 0 | 0 |
| 29990 | 29991 | 140000.0 | 1 | 2 | 1 | 41 | 0 | 0 | 0 | 0 | ... | 1 | 0 | 0 | 0 | 132325 | 130142 | 134882 | 136757 | 47675 | 44121 |
| 29991 | 29992 | 210000.0 | 1 | 2 | 1 | 34 | 3 | 2 | 2 | 2 | ... | 0 | 0 | 0 | 0 | 2500 | 2500 | 2500 | 2500 | 2500 | 2500 |
| 29992 | 29993 | 10000.0 | 1 | 3 | 1 | 43 | 0 | 0 | 0 | -2 | ... | 0 | 0 | 0 | 0 | 6802 | 10400 | 0 | 0 | 0 | 0 |
| 29993 | 29994 | 100000.0 | 1 | 1 | 2 | 38 | 0 | -1 | -1 | 0 | ... | 0 | 0 | 0 | 0 | 1042 | -110357 | 98996 | 67626 | 67473 | 53004 |
| 29994 | 29995 | 80000.0 | 1 | 2 | 2 | 34 | 2 | 2 | 2 | 2 | ... | 0 | 0 | 0 | 0 | 65557 | 74208 | 79384 | 70519 | 82607 | 77158 |
| 29995 | 29996 | 220000.0 | 1 | 3 | 1 | 39 | 0 | 0 | 0 | 0 | ... | 1 | 0 | 0 | 0 | 180448 | 172815 | 203362 | 84957 | 26237 | 14980 |
| 29996 | 29997 | 150000.0 | 1 | 3 | 2 | 43 | -1 | -1 | -1 | -1 | ... | 0 | 0 | 0 | 0 | -154 | -1698 | -5496 | 8850 | 5190 | 0 |
| 29997 | 29998 | 30000.0 | 1 | 2 | 2 | 37 | 4 | 3 | 2 | -1 | ... | 0 | 0 | 0 | 0 | 3565 | 3356 | -19242 | 16678 | 18582 | 16257 |
| 29998 | 29999 | 80000.0 | 1 | 3 | 1 | 41 | 1 | -1 | 0 | 0 | ... | 0 | 0 | 0 | 0 | -87545 | 74970 | 75126 | 50848 | -41109 | 47140 |
| 29999 | 30000 | 50000.0 | 1 | 2 | 1 | 46 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 45851 | 47105 | 48334 | 35535 | 31428 | 14313 |
30000 rows × 38 columns
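The `pd.read_csv` call above uses an absolute path that exists on one machine only, and displaying the whole frame prints a long table. A possible alternative is sketched below: it checks that the file exists and previews only the shape and the first rows. The `load_credit_data` helper and the temporary stand-in file are assumptions made so the example runs on its own; in practice the path would point at the real `DA_Credit_Card.csv`.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_credit_data(csv_path: Path) -> pd.DataFrame:
    """Read the CSV and fail early with a clear message if it is missing."""
    if not csv_path.is_file():
        raise FileNotFoundError(f"Dataset not found: {csv_path}")
    return pd.read_csv(csv_path)


# Demonstration on a tiny stand-in file (hypothetical values, real column names).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "DA_Credit_Card.csv"
    demo_path.write_text("ID,LIMIT_BAL,SEX\n1,20000.0,2\n2,120000.0,2\n")
    demo = load_credit_data(demo_path)

print(demo.shape)   # (rows, columns)
print(demo.head())  # first rows only, instead of the full frame
```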
creditdata.describe()
| | ID | LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | ... | Is Average greater than 10k and less than 30k | Is Average greater than 30k and less than 50k | Is Average greater than 50k and less than 70k | Is Average greater than 70k and less than 100k | DUE_1 | DUE_2 | DUE_3 | DUE_4 | DUE_5 | DUE_6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 | ... | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 | 30000.000000 | 3.000000e+04 | 3.000000e+04 | 30000.00000 | 30000.000000 | 30000.000000 |
| mean | 15000.500000 | 167484.322667 | 1.603733 | 1.853133 | 1.551867 | 35.485500 | -0.016700 | -0.133767 | -0.166200 | -0.220667 | ... | 0.126500 | 0.009633 | 0.000400 | 0.000400 | 45559.750400 | 4.325791e+04 | 4.178747e+04 | 38436.87210 | 35512.013333 | 33656.257833 |
| std | 8660.398374 | 129747.661567 | 0.489129 | 0.790349 | 0.521970 | 9.217904 | 1.123802 | 1.197186 | 1.196868 | 1.169139 | ... | 0.332418 | 0.097677 | 0.019996 | 0.019996 | 73173.789447 | 7.256594e+04 | 6.929536e+04 | 64200.61083 | 60553.370054 | 60151.290836 |
| min | 1.000000 | 10000.000000 | 1.000000 | 0.000000 | 0.000000 | 21.000000 | -2.000000 | -2.000000 | -2.000000 | -2.000000 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | -733744.000000 | -1.702347e+06 | -8.546410e+05 | -667000.00000 | -414380.000000 | -684896.000000 |
| 25% | 7500.750000 | 50000.000000 | 1.000000 | 1.000000 | 1.000000 | 28.000000 | -1.000000 | -1.000000 | -1.000000 | -1.000000 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 745.000000 | 3.295000e+02 | 2.627500e+02 | 230.00000 | 0.000000 | 0.000000 |
| 50% | 15000.500000 | 140000.000000 | 2.000000 | 2.000000 | 2.000000 | 34.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 18550.500000 | 1.810250e+04 | 1.776900e+04 | 16970.00000 | 15538.000000 | 13926.500000 |
| 75% | 22500.250000 | 240000.000000 | 2.000000 | 2.000000 | 2.000000 | 41.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 62241.500000 | 5.907775e+04 | 5.629425e+04 | 50259.50000 | 46961.500000 | 46067.250000 |
| max | 30000.000000 | 1000000.000000 | 2.000000 | 6.000000 | 3.000000 | 79.000000 | 8.000000 | 8.000000 | 8.000000 | 8.000000 | ... | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 913727.000000 | 9.332080e+05 | 1.542258e+06 | 841586.00000 | 877171.000000 | 911408.000000 |
8 rows × 38 columns
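In the summary above, the minimum of EDUCATION and MARRIAGE is 0 and the minimum of the PAY columns is -2, values that the feature descriptions do not list. One way to quantify such codes is sketched below on a hypothetical four-row frame; the `DOCUMENTED_CODES` dictionary and the `undocumented_counts` helper are illustrative names, and only `PAY_0` is included for brevity.

```python
import pandas as pd

# Codes that the feature descriptions explicitly list.
DOCUMENTED_CODES = {
    "EDUCATION": {1, 2, 3, 4, 5, 6},
    "MARRIAGE": {1, 2, 3},
    "PAY_0": {-1, 1, 2, 3, 4, 5, 6, 7, 8, 9},
}


def undocumented_counts(df: pd.DataFrame) -> dict:
    """Count, per column, the values that the descriptions do not list."""
    result = {}
    for column, allowed in DOCUMENTED_CODES.items():
        extra = df.loc[~df[column].isin(list(allowed)), column]
        result[column] = extra.value_counts().sort_index().to_dict()
    return result


# Hypothetical sample rows, not taken from the real file.
sample = pd.DataFrame({
    "EDUCATION": [2, 0, 1, 6],
    "MARRIAGE": [1, 0, 2, 3],
    "PAY_0": [2, -2, 0, -1],
})
report = undocumented_counts(sample)
print(report)
```

Running the same function on the full frame would show how many rows carry each unlisted code, which helps decide whether to merge them into an existing category or treat them separately.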
creditdata.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 30000 entries, 0 to 29999
Data columns (total 38 columns):
ID    30000 non-null int64
LIMIT_BAL    30000 non-null float64
SEX    30000 non-null int64
EDUCATION    30000 non-null int64
MARRIAGE    30000 non-null int64
AGE    30000 non-null int64
PAY_0    30000 non-null int64
PAY_2    30000 non-null int64
PAY_3    30000 non-null int64
PAY_4    30000 non-null int64
PAY_5    30000 non-null int64
PAY_6    30000 non-null int64
BILL_AMT1    30000 non-null float64
BILL_AMT2    30000 non-null float64
BILL_AMT3    30000 non-null float64
BILL_AMT4    30000 non-null float64
BILL_AMT5    30000 non-null float64
BILL_AMT6    30000 non-null float64
PAY_AMT1    30000 non-null float64
PAY_AMT2    30000 non-null float64
PAY_AMT3    30000 non-null float64
PAY_AMT4    30000 non-null float64
PAY_AMT5    30000 non-null float64
PAY_AMT6    30000 non-null float64
default.payment.next.month    30000 non-null int64
Number of missed payments    30000 non-null int64
Average Bill Amount (TD)    30000 non-null float64
Is Average Bill Amount less than 10K?    30000 non-null int64
Is Average greater than 10k and less than 30k    30000 non-null int64
Is Average greater than 30k and less than 50k    30000 non-null int64
Is Average greater than 50k and less than 70k    30000 non-null int64
Is Average greater than 70k and less than 100k    30000 non-null int64
DUE_1    30000 non-null int64
DUE_2    30000 non-null int64
DUE_3    30000 non-null int64
DUE_4    30000 non-null int64
DUE_5    30000 non-null int64
DUE_6    30000 non-null int64
dtypes: float64(14), int64(24)
memory usage: 8.7 MB
cdata = creditdata.isnull().sum()
cdata
ID    0
LIMIT_BAL    0
SEX    0
EDUCATION    0
MARRIAGE    0
AGE    0
PAY_0    0
PAY_2    0
PAY_3    0
PAY_4    0
PAY_5    0
PAY_6    0
BILL_AMT1    0
BILL_AMT2    0
BILL_AMT3    0
BILL_AMT4    0
BILL_AMT5    0
BILL_AMT6    0
PAY_AMT1    0
PAY_AMT2    0
PAY_AMT3    0
PAY_AMT4    0
PAY_AMT5    0
PAY_AMT6    0
default.payment.next.month    0
Number of missed payments    0
Average Bill Amount (TD)    0
Is Average Bill Amount less than 10K?    0
Is Average greater than 10k and less than 30k    0
Is Average greater than 30k and less than 50k    0
Is Average greater than 50k and less than 70k    0
Is Average greater than 70k and less than 100k    0
DUE_1    0
DUE_2    0
DUE_3    0
DUE_4    0
DUE_5    0
DUE_6    0
dtype: int64
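With 38 columns, a per-column list of zeros is long to read. A shorter check reports only the columns that actually contain missing values. The sketch below uses a hypothetical frame with one gap so that the report is not empty; `missing_report` is an illustrative helper name.

```python
import numpy as np
import pandas as pd


def missing_report(df: pd.DataFrame) -> pd.Series:
    """Return missing-value counts for the columns that have at least one."""
    counts = df.isnull().sum()
    return counts[counts > 0].sort_values(ascending=False)


# Hypothetical frame with one gap, to show what a non-empty report looks like.
sample = pd.DataFrame({"LIMIT_BAL": [20000.0, np.nan, 90000.0], "AGE": [24, 26, 34]})
report = missing_report(sample)
print(report)

# For a frame without gaps the report is empty.
print(missing_report(sample.dropna()).empty)
```

Applied to the credit data, an empty report would confirm the all-zero output above in a single line.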
new_data = pd.DataFrame(creditdata)
new_data
new_data = new_data.rename(columns = {'PAY_0':'PAY_1','default.payment.next.month':'default'})
new_data.dtypes
ID    int64
LIMIT_BAL    float64
SEX    int64
EDUCATION    int64
MARRIAGE    int64
AGE    int64
PAY_1    int64
PAY_2    int64
PAY_3    int64
PAY_4    int64
PAY_5    int64
PAY_6    int64
BILL_AMT1    float64
BILL_AMT2    float64
BILL_AMT3    float64
BILL_AMT4    float64
BILL_AMT5    float64
BILL_AMT6    float64
PAY_AMT1    float64
PAY_AMT2    float64
PAY_AMT3    float64
PAY_AMT4    float64
PAY_AMT5    float64
PAY_AMT6    float64
default    int64
Number of missed payments    int64
Average Bill Amount (TD)    float64
Is Average Bill Amount less than 10K?    int64
Is Average greater than 10k and less than 30k    int64
Is Average greater than 30k and less than 50k    int64
Is Average greater than 50k and less than 70k    int64
Is Average greater than 70k and less than 100k    int64
DUE_1    int64
DUE_2    int64
DUE_3    int64
DUE_4    int64
DUE_5    int64
DUE_6    int64
dtype: object
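By default, `DataFrame.rename` silently ignores keys that are not present, so a typo in a column name goes unnoticed. The sketch below, on a hypothetical two-row frame, passes `errors="raise"` so that a missing source column fails loudly; the `RENAMES` constant is an illustrative name, and the option assumes a pandas version that supports it (0.25 or later).

```python
import pandas as pd

RENAMES = {"PAY_0": "PAY_1", "default.payment.next.month": "default"}

# Hypothetical sample rows with the original column names.
sample = pd.DataFrame({
    "PAY_0": [2, -1],
    "PAY_2": [2, 2],
    "default.payment.next.month": [1, 1],
})

# errors="raise" turns a misspelled source column into a KeyError.
renamed = sample.rename(columns=RENAMES, errors="raise")
print(list(renamed.columns))
```

After the rename, the repayment columns run consistently from `PAY_1` to `PAY_6`, matching the numbering of `BILL_AMT1` to `BILL_AMT6` and `PAY_AMT1` to `PAY_AMT6`.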
conda install -c plotly plotly=4.10.0
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done
## Package Plan ##
environment location: C:\ProgramData\Anaconda3
added / updated specs:
- plotly=4.10.0
The following packages will be downloaded:
package | build
---------------------------|-----------------
ca-certificates-2020.12.8 | haa95532_0 122 KB
certifi-2020.12.5 | py37haa95532_0 141 KB
------------------------------------------------------------
Total: 262 KB
The following packages will be UPDATED:
ca-certificates 2020.7.22-0 --> 2020.12.8-haa95532_0
certifi 2020.6.20-py37_0 --> 2020.12.5-py37haa95532_0
conda 4.8.5-py37_0 --> 4.9.2-py37haa95532_0
Downloading and Extracting Packages
Preparing transaction: ...working... done
Verifying transaction: ...working... failed
Note: you may need to restart the kernel to use updated packages.
EnvironmentNotWritableError: The current user does not have write permissions to the target environment. environment location: C:\ProgramData\Anaconda3
#gendertype = {'1': 'Male','2':'Female'}
#new_data.SEX = new_data.loc[new_data.SEX == '1','gender'] = 'Male'
#new_data.SEX = new_data.loc[new_data.SEX == '2','gender'] = 'Female'
new_data['gender'] = new_data['SEX'].apply(lambda x: 'Male' if x == 1 else 'Female')
new_data.dtypes
new_data.gender = new_data['gender'].astype('category')
#new_data.EDUCATION = new_data['EDUCATION'].astype('category')
new_data.dtypes
ID    int64
LIMIT_BAL    float64
SEX    int64
EDUCATION    int64
MARRIAGE    int64
AGE    int64
PAY_1    int64
PAY_2    int64
PAY_3    int64
PAY_4    int64
PAY_5    int64
PAY_6    int64
BILL_AMT1    float64
BILL_AMT2    float64
BILL_AMT3    float64
BILL_AMT4    float64
BILL_AMT5    float64
BILL_AMT6    float64
PAY_AMT1    float64
PAY_AMT2    float64
PAY_AMT3    float64
PAY_AMT4    float64
PAY_AMT5    float64
PAY_AMT6    float64
default    int64
Number of missed payments    int64
Average Bill Amount (TD)    float64
Is Average Bill Amount less than 10K?    int64
Is Average greater than 10k and less than 30k    int64
Is Average greater than 30k and less than 50k    int64
Is Average greater than 50k and less than 70k    int64
Is Average greater than 70k and less than 100k    int64
DUE_1    int64
DUE_2    int64
DUE_3    int64
DUE_4    int64
DUE_5    int64
DUE_6    int64
gender    category
dtype: object
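The `apply` with a lambda above labels every value other than 1 as 'Female', including any invalid code. A dictionary passed to `Series.map`, close to what the commented-out `gendertype` attempt was aiming for, leaves unexpected codes as NaN instead. The sketch below uses a hypothetical four-row frame, and `GENDER_LABELS` is an illustrative name; note that the keys are integers because `SEX` is stored as int64.

```python
import pandas as pd

# Integer keys, since SEX is an int64 column (string keys like '1' would not match).
GENDER_LABELS = {1: "Male", 2: "Female"}

# Hypothetical sample rows; 3 is a deliberately invalid code.
sample = pd.DataFrame({"SEX": [2, 1, 2, 3]})
sample["gender"] = sample["SEX"].map(GENDER_LABELS).astype("category")
print(sample)
print(sample["gender"].dtype)
```

On the real data, where the summary shows `SEX` ranging only from 1 to 2, both approaches should give the same result; the dictionary version simply makes a bad code visible.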
cdata = new_data
cdata
| | ID | LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_1 | PAY_2 | PAY_3 | PAY_4 | ... | Is Average greater than 30k and less than 50k | Is Average greater than 50k and less than 70k | Is Average greater than 70k and less than 100k | DUE_1 | DUE_2 | DUE_3 | DUE_4 | DUE_5 | DUE_6 | gender |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 20000.0 | 2 | 2 | 1 | 24 | 2 | 2 | -1 | -1 | ... | 0 | 0 | 0 | 3913 | 2413 | 689 | 0 | 0 | 0 | Female |
| 1 | 2 | 120000.0 | 2 | 2 | 2 | 26 | -1 | 2 | 0 | 0 | ... | 0 | 0 | 0 | 2682 | 725 | 1682 | 2272 | 3455 | 1261 | Female |
| 2 | 3 | 90000.0 | 2 | 2 | 2 | 34 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 27721 | 12527 | 12559 | 13331 | 13948 | 10549 | Female |
| 3 | 4 | 50000.0 | 2 | 2 | 1 | 37 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 44990 | 46214 | 48091 | 27214 | 27890 | 28547 | Female |
| 4 | 5 | 50000.0 | 1 | 2 | 1 | 57 | -1 | 0 | -1 | 0 | ... | 0 | 0 | 0 | 6617 | -31011 | 25835 | 11940 | 18457 | 18452 | Male |
| 5 | 6 | 50000.0 | 1 | 1 | 2 | 37 | 0 | 0 | 0 | 0 | ... | 1 | 0 | 0 | 61900 | 55254 | 56951 | 18394 | 18619 | 19224 | Male |
| 6 | 7 | 500000.0 | 1 | 1 | 2 | 29 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 312965 | 372023 | 407007 | 522414 | 469253 | 460174 | Male |
| 7 | 8 | 100000.0 | 2 | 2 | 2 | 23 | 0 | -1 | -1 | 0 | ... | 0 | 0 | 0 | 11496 | -221 | 601 | -360 | -1846 | -975 | Female |
| 8 | 9 | 140000.0 | 2 | 3 | 1 | 28 | 0 | 0 | 2 | 0 | ... | 0 | 0 | 0 | 7956 | 14096 | 11676 | 11211 | 10793 | 2719 | Female |
| 9 | 10 | 20000.0 | 1 | 3 | 2 | 35 | -2 | -2 | -2 | -2 | ... | 0 | 0 | 0 | 0 | 0 | 0 | -13007 | 11885 | 13912 | Male |
| 10 | 11 | 200000.0 | 2 | 3 | 2 | 34 | 0 | 0 | 2 | 0 | ... | 0 | 0 | 0 | 8767 | 9775 | 5485 | 2213 | -1910 | 3665 | Female |
| 11 | 12 | 260000.0 | 2 | 1 | 2 | 51 | -1 | -1 | -1 | -1 | ... | 0 | 0 | 0 | -9557 | 11704 | 1383 | -13784 | 22287 | 10028 | Female |
| 12 | 13 | 630000.0 | 2 | 2 | 2 | 41 | -1 | 0 | -1 | -1 | ... | 0 | 0 | 0 | 11137 | 0 | 0 | 0 | 3630 | 2870 | Female |
| 13 | 14 | 70000.0 | 1 | 2 | 2 | 30 | 1 | 2 | 2 | 0 | ... | 0 | 0 | 0 | 62602 | 67369 | 62701 | 63782 | 34637 | 36894 | Male |
| 14 | 15 | 250000.0 | 1 | 1 | 2 | 29 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 67887 | 64060 | 60561 | 56696 | 53875 | 52512 | Male |
| 15 | 16 | 50000.0 | 2 | 3 | 3 | 23 | 1 | 2 | 0 | 0 | ... | 0 | 0 | 0 | 50614 | 27673 | 27016 | 27571 | 28231 | 29111 | Female |
| 16 | 17 | 20000.0 | 1 | 1 | 2 | 24 | 0 | 0 | 2 | 2 | ... | 0 | 0 | 0 | 12176 | 18010 | 15928 | 18338 | 16255 | 19104 | Male |
| 17 | 18 | 320000.0 | 1 | 1 | 1 | 49 | 0 | 0 | 0 | -1 | ... | 0 | 0 | 0 | 242928 | 236536 | 118723 | 50074 | -189743 | 145599 | Male |
| 18 | 19 | 360000.0 | 2 | 1 | 1 | 49 | 1 | -2 | -2 | -2 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Female |
| 19 | 20 | 180000.0 | 2 | 1 | 2 | 29 | 1 | -2 | -2 | -2 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Female |
| 20 | 21 | 130000.0 | 2 | 3 | 2 | 39 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 35358 | 26151 | 23489 | 18616 | 10872 | -32834 | Female |
| 21 | 22 | 120000.0 | 2 | 2 | 1 | 39 | -1 | -1 | -1 | -1 | ... | 0 | 0 | 0 | 0 | 0 | 316 | -632 | 316 | 316 | Female |
| 22 | 23 | 70000.0 | 2 | 2 | 2 | 26 | 2 | 0 | 0 | 2 | ... | 0 | 0 | 0 | 39080 | 38863 | 45020 | 40405 | 46905 | 44192 | Female |
| 23 | 24 | 450000.0 | 2 | 1 | 1 | 40 | -2 | -2 | -2 | -2 | ... | 0 | 0 | 0 | -13916 | 17947 | 913 | 560 | 0 | -1128 | Female |
| 24 | 25 | 90000.0 | 1 | 1 | 2 | 23 | 0 | 0 | 0 | -1 | ... | 0 | 0 | 0 | -1013 | 7070 | -5398 | 4198 | 4315 | 6292 | Male |
| 25 | 26 | 50000.0 | 1 | 3 | 2 | 23 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 45647 | 40384 | 35022 | 27535 | 28767 | 29049 | Male |
| 26 | 27 | 60000.0 | 1 | 1 | 2 | 27 | 1 | -2 | -1 | -1 | ... | 0 | 0 | 0 | -109 | -1425 | 259 | -557 | 127 | -1189 | Male |
| 27 | 28 | 50000.0 | 2 | 3 | 2 | 30 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 21241 | 14838 | 16163 | 16378 | 17931 | 18605 | Female |
| 28 | 29 | 50000.0 | 2 | 3 | 1 | 47 | -1 | -1 | -1 | -1 | ... | 0 | 0 | 0 | -2765 | -6 | 1372 | -28390 | 30173 | 257 | Female |
| 29 | 30 | 50000.0 | 1 | 1 | 2 | 26 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 13829 | 15075 | 16496 | 16907 | 16775 | 11400 | Male |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 29970 | 29971 | 360000.0 | 1 | 1 | 1 | 34 | -1 | -1 | -1 | 0 | ... | 0 | 0 | 0 | -19297 | -11849 | 55162 | 48952 | -10908 | 3407 | Male |
| 29971 | 29972 | 80000.0 | 1 | 3 | 1 | 36 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 63159 | 64358 | 65749 | 67118 | 67370 | 70612 | Male |
| 29972 | 29973 | 190000.0 | 1 | 1 | 1 | 37 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 19628 | 17024 | -19259 | 19108 | -128866 | 143682 | Male |
| 29973 | 29974 | 230000.0 | 1 | 2 | 1 | 35 | 1 | -2 | -2 | -2 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Male |
| 29974 | 29975 | 50000.0 | 1 | 2 | 1 | 37 | 1 | 2 | 2 | 2 | ... | 0 | 0 | 0 | 10904 | 6316 | 4328 | 2846 | 585 | 324 | Male |
| 29975 | 29976 | 220000.0 | 1 | 2 | 1 | 41 | 0 | 0 | -1 | -1 | ... | 0 | 0 | 0 | 36235 | 2197 | -4555 | 4165 | -65 | -5198 | Male |
| 29976 | 29977 | 40000.0 | 1 | 2 | 2 | 47 | 2 | 2 | 3 | 2 | ... | 0 | 0 | 0 | 48358 | 54892 | 51415 | 51259 | 43631 | 46934 | Male |
| 29977 | 29978 | 420000.0 | 1 | 1 | 2 | 34 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 124939 | 129721 | 134511 | 136195 | 139239 | 142954 | Male |
| 29978 | 29979 | 310000.0 | 1 | 2 | 1 | 39 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 228944 | 227978 | 223825 | 211360 | 208500 | 200616 | Male |
| 29979 | 29980 | 180000.0 | 1 | 1 | 1 | 32 | -2 | -2 | -2 | -2 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Male |
| 29980 | 29981 | 50000.0 | 1 | 3 | 2 | 42 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 33998 | 45123 | 44397 | 47360 | 15471 | 17694 | Male |
| 29981 | 29982 | 50000.0 | 1 | 2 | 1 | 44 | 1 | 2 | 2 | 2 | ... | 0 | 0 | 0 | 36371 | 35072 | 33101 | 27675 | 22173 | 14062 | Male |
| 29982 | 29983 | 90000.0 | 1 | 2 | 1 | 36 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 6252 | 7612 | 8806 | 10128 | 9536 | 14329 | Male |
| 29983 | 29984 | 20000.0 | 1 | 2 | 1 | 44 | -2 | -2 | -2 | -2 | ... | 0 | 0 | 0 | -1068 | 152 | -178 | -6381 | 7411 | 18 | Male |
| 29984 | 29985 | 30000.0 | 1 | 2 | 2 | 38 | -1 | -1 | -2 | -1 | ... | 0 | 0 | 0 | -608 | -2054 | 940 | -1064 | -1412 | 2319 | Male |
| 29985 | 29986 | 240000.0 | 1 | 1 | 2 | 30 | -2 | -2 | -2 | -2 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Male |
| 29986 | 29987 | 360000.0 | 1 | 1 | 2 | 35 | -1 | -1 | -2 | -2 | ... | 0 | 0 | 0 | 2220 | 0 | 0 | 0 | 0 | 0 | Male |
| 29987 | 29988 | 130000.0 | 1 | 1 | 2 | 34 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 20292 | 12077 | -77454 | 104047 | 88681 | 93348 | Male |
| 29988 | 29989 | 250000.0 | 1 | 1 | 1 | 34 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 214640 | 244113 | 234064 | 239750 | 168005 | 173678 | Male |
| 29989 | 29990 | 150000.0 | 1 | 1 | 2 | 35 | -1 | -1 | -1 | -1 | ... | 0 | 0 | 0 | -5629 | 9009 | -786 | 780 | 0 | 0 | Male |
| 29990 | 29991 | 140000.0 | 1 | 2 | 1 | 41 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 132325 | 130142 | 134882 | 136757 | 47675 | 44121 | Male |
| 29991 | 29992 | 210000.0 | 1 | 2 | 1 | 34 | 3 | 2 | 2 | 2 | ... | 0 | 0 | 0 | 2500 | 2500 | 2500 | 2500 | 2500 | 2500 | Male |
| 29992 | 29993 | 10000.0 | 1 | 3 | 1 | 43 | 0 | 0 | 0 | -2 | ... | 0 | 0 | 0 | 6802 | 10400 | 0 | 0 | 0 | 0 | Male |
| 29993 | 29994 | 100000.0 | 1 | 1 | 2 | 38 | 0 | -1 | -1 | 0 | ... | 0 | 0 | 0 | 1042 | -110357 | 98996 | 67626 | 67473 | 53004 | Male |
| 29994 | 29995 | 80000.0 | 1 | 2 | 2 | 34 | 2 | 2 | 2 | 2 | ... | 0 | 0 | 0 | 65557 | 74208 | 79384 | 70519 | 82607 | 77158 | Male |
| 29995 | 29996 | 220000.0 | 1 | 3 | 1 | 39 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 180448 | 172815 | 203362 | 84957 | 26237 | 14980 | Male |
| 29996 | 29997 | 150000.0 | 1 | 3 | 2 | 43 | -1 | -1 | -1 | -1 | ... | 0 | 0 | 0 | -154 | -1698 | -5496 | 8850 | 5190 | 0 | Male |
| 29997 | 29998 | 30000.0 | 1 | 2 | 2 | 37 | 4 | 3 | 2 | -1 | ... | 0 | 0 | 0 | 3565 | 3356 | -19242 | 16678 | 18582 | 16257 | Male |
| 29998 | 29999 | 80000.0 | 1 | 3 | 1 | 41 | 1 | -1 | 0 | 0 | ... | 0 | 0 | 0 | -87545 | 74970 | 75126 | 50848 | -41109 | 47140 | Male |
| 29999 | 30000 | 50000.0 | 1 | 2 | 1 | 46 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 45851 | 47105 | 48334 | 35535 | 31428 | 14313 | Male |
30000 rows × 39 columns
# Treat the six monthly repayment-status codes as categorical rather than numeric
for col in ['PAY_1', 'PAY_2', 'PAY_3', 'PAY_4', 'PAY_5', 'PAY_6']:
    new_data[col] = new_data[col].astype('category')
new_data
cdata = new_data.drop(columns=['ID','SEX','EDUCATION','MARRIAGE'])
cdata.dtypes
LIMIT_BAL    float64
AGE    int64
PAY_1    category
PAY_2    category
PAY_3    category
PAY_4    category
PAY_5    category
PAY_6    category
BILL_AMT1    float64
BILL_AMT2    float64
BILL_AMT3    float64
BILL_AMT4    float64
BILL_AMT5    float64
BILL_AMT6    float64
PAY_AMT1    float64
PAY_AMT2    float64
PAY_AMT3    float64
PAY_AMT4    float64
PAY_AMT5    float64
PAY_AMT6    float64
default    int64
Number of missed payments    int64
Average Bill Amount (TD)    float64
Is Average Bill Amount less than 10K?    int64
Is Average greater than 10k and less than 30k    int64
Is Average greater than 30k and less than 50k    int64
Is Average greater than 50k and less than 70k    int64
Is Average greater than 70k and less than 100k    int64
DUE_1    int64
DUE_2    int64
DUE_3    int64
DUE_4    int64
DUE_5    int64
DUE_6    int64
gender    category
dtype: object
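# Average of every numeric column for non-defaulters (0) and defaulters (1);
# numeric_only=True skips the categorical columns, which have no mean
gdata = cdata.groupby('default').mean(numeric_only=True)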
gdata
| default | LIMIT_BAL | AGE | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | ... | Is Average greater than 10k and less than 30k | Is Average greater than 30k and less than 50k | Is Average greater than 50k and less than 70k | Is Average greater than 70k and less than 100k | DUE_1 | DUE_2 | DUE_3 | DUE_4 | DUE_5 | DUE_6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 178099.726074 | 35.417266 | 51994.227273 | 49717.435670 | 47533.365605 | 43611.165254 | 40530.445343 | 39042.268704 | 6307.337357 | 6640.465074 | ... | 0.130671 | 0.009673 | 0.000428 | 0.000428 | 45686.889916 | 43076.970596 | 41779.868772 | 38310.635936 | 35282.225047 | 33322.896935 |
| 1 | 130109.656420 | 35.725738 | 48509.162297 | 47283.617842 | 45181.598855 | 42036.950573 | 39540.190476 | 38271.435503 | 3397.044153 | 3388.649638 | ... | 0.111814 | 0.009494 | 0.000301 | 0.000301 | 45112.118143 | 43894.968204 | 41814.247288 | 38881.323840 | 36321.050934 | 34829.953436 |
2 rows × 27 columns
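# Average of every numeric column for each count of missed payments (0 to 6);
# numeric_only=True skips the categorical columns, which have no mean
mdata = cdata.groupby('Number of missed payments').mean(numeric_only=True)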
mdata
| Number of missed payments | LIMIT_BAL | AGE | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | ... | Is Average greater than 10k and less than 30k | Is Average greater than 30k and less than 50k | Is Average greater than 50k and less than 70k | Is Average greater than 70k and less than 100k | DUE_1 | DUE_2 | DUE_3 | DUE_4 | DUE_5 | DUE_6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 187259.429030 | 35.617731 | 55865.757664 | 52762.414831 | 49905.399127 | 45262.176158 | 41552.864733 | 40013.162059 | 6682.669209 | 7153.626160 | ... | 0.139481 | 0.009583 | 0.000552 | 0.000552 | 49183.088455 | 45608.788671 | 43639.297125 | 39555.829963 | 35795.038081 | 33779.852742 |
| 1 | 158388.160868 | 35.368504 | 31428.314731 | 32340.346814 | 31464.131270 | 29475.949390 | 28272.415273 | 26796.337325 | 4971.456620 | 4190.661771 | ... | 0.074559 | 0.010167 | 0.000000 | 0.000000 | 26456.858111 | 28149.685043 | 27793.322865 | 25960.255310 | 25218.136014 | 23544.760958 |
| 2 | 122588.730911 | 35.135861 | 49409.147973 | 46628.569774 | 45410.511322 | 41875.143760 | 39460.458136 | 38316.344392 | 2543.404423 | 3867.588731 | ... | 0.110058 | 0.009479 | 0.000000 | 0.000000 | 46865.743549 | 42760.981043 | 41881.339126 | 38578.436546 | 35898.720906 | 34620.944181 |
| 3 | 105173.310225 | 34.893414 | 50128.248700 | 49836.857019 | 47377.491334 | 44834.544194 | 42374.305893 | 40078.729636 | 2736.154246 | 2074.029463 | ... | 0.115251 | 0.007799 | 0.000000 | 0.000000 | 47392.094454 | 47762.827556 | 44236.995667 | 41942.696707 | 39655.598787 | 36915.093588 |
| 4 | 87392.218717 | 34.544690 | 47055.627760 | 48070.781283 | 48194.715037 | 46463.386961 | 44855.863302 | 44531.368034 | 2906.318612 | 2863.839117 | ... | 0.118822 | 0.009464 | 0.000000 | 0.000000 | 44149.309148 | 45206.942166 | 46445.375394 | 43744.176656 | 42492.286015 | 41801.973712 |
| 5 | 93422.818792 | 35.053691 | 53804.167785 | 53789.268456 | 53151.117450 | 52168.734899 | 49115.006711 | 46571.020134 | 2244.453020 | 2247.459732 | ... | 0.137584 | 0.006711 | 0.000000 | 0.000000 | 51559.714765 | 51541.808725 | 50625.261745 | 51226.060403 | 46437.567114 | 42145.604027 |
| 6 | 94049.217002 | 35.674124 | 53451.360925 | 54304.465324 | 55100.295302 | 55417.269202 | 55845.350485 | 55785.833706 | 2454.336316 | 2518.178971 | ... | 0.140940 | 0.011186 | 0.000746 | 0.000746 | 50997.024609 | 51786.286353 | 52944.134974 | 53162.136465 | 53789.138702 | 53358.777032 |
7 rows × 27 columns
import plotly.express as px
# Number of clients by gender, split by default status
fig = px.histogram(new_data, x="gender", color='default')
fig.update_xaxes(
    constrain="domain",  # compress the x-axis by shrinking its domain
    categoryorder="total descending"  # most frequent category first
)
fig.show()
new_data.loc[(new_data['EDUCATION'] == 0) | (new_data['EDUCATION'] == 5)|(new_data['EDUCATION'] == 6),'EDUCATION'] = 4
new_data
new_data.EDUCATION = new_data['EDUCATION'].astype('category')
new_data.dtypes
new_data.EDUCATION.unique()
[2, 1, 3, 4] Categories (4, int64): [2, 1, 3, 4]
education_level = {1:'Graduate School',2:'University',3:'High School',4:'Others'}
new_data['Education_level']= new_data['EDUCATION'].map(education_level)
new_data.dtypes
ID    int64
LIMIT_BAL    float64
SEX    int64
EDUCATION    category
MARRIAGE    int64
AGE    int64
PAY_1    category
PAY_2    category
PAY_3    category
PAY_4    category
PAY_5    category
PAY_6    category
BILL_AMT1    float64
BILL_AMT2    float64
BILL_AMT3    float64
BILL_AMT4    float64
BILL_AMT5    float64
BILL_AMT6    float64
PAY_AMT1    float64
PAY_AMT2    float64
PAY_AMT3    float64
PAY_AMT4    float64
PAY_AMT5    float64
PAY_AMT6    float64
default    int64
Number of missed payments    int64
Average Bill Amount (TD)    float64
Is Average Bill Amount less than 10K?    int64
Is Average greater than 10k and less than 30k    int64
Is Average greater than 30k and less than 50k    int64
Is Average greater than 50k and less than 70k    int64
Is Average greater than 70k and less than 100k    int64
DUE_1    int64
DUE_2    int64
DUE_3    int64
DUE_4    int64
DUE_5    int64
DUE_6    int64
gender    category
Education_level    object
dtype: object
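The recoding and labelling of EDUCATION can be checked in isolation. The following is a minimal sketch on a made-up series of codes (the `codes` values are illustrative, not taken from the dataset): it folds the undocumented codes 0, 5 and 6 into 4, applies the same label dictionary, and then verifies that no code was left unmapped.

```python
import pandas as pd

# Hypothetical raw codes, including the undocumented/unknown 0, 5 and 6.
codes = pd.Series([1, 2, 3, 4, 0, 5, 6, 2])

# Keep a code where it is NOT one of 0/5/6; otherwise replace it with 4 ("Others").
cleaned = codes.where(~codes.isin([0, 5, 6]), 4)

labels = {1: "Graduate School", 2: "University", 3: "High School", 4: "Others"}
education_level = cleaned.map(labels)

# map() yields NaN for any code missing from the dictionary, so this
# check fails loudly if an unmapped code ever appears in the data.
assert education_level.notna().all()
print(education_level.value_counts())
```

The same check applies to any other code-to-label mapping in the notebook, since a missing or mistyped key otherwise shows up only as silent NaN values in the plots.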
# Percentage of clients by education level, split by default status
fig = px.histogram(new_data, x="Education_level", color='default', histnorm='percent')
fig.update_xaxes(
    range=[0,4],  # sets the range of the x-axis
    constrain="domain",  # compress the x-axis by shrinking its domain
    categoryorder="total descending"  # most frequent category first
)
fig.show()
import plotly.express as px
# Average credit limit per age bin, split by gender
fig = px.histogram(new_data, x="AGE", y="LIMIT_BAL", color='gender', histfunc='avg', nbins=10)
fig.update_layout(
    title="Age vs Credit Balance",
    xaxis_title="AGE",
    yaxis_title="Average Credit Given")
fig.show()
new_data.MARRIAGE.unique()
new_data.loc[(new_data['MARRIAGE'] == 0),'MARRIAGE'] = 3
new_data
new_data.MARRIAGE = new_data['MARRIAGE'].astype('category')
new_data.MARRIAGE.unique()
[1, 2, 3] Categories (3, int64): [1, 2, 3]
# Labels follow the data dictionary: 1=married, 2=single, 3=others
marriage_status = {1:'Married',2:'Single',3:'Others'}
new_data['Marriage_status']= new_data['MARRIAGE'].map(marriage_status)
new_data.dtypes
ID    int64
LIMIT_BAL    float64
SEX    int64
EDUCATION    category
MARRIAGE    category
AGE    int64
PAY_1    category
PAY_2    category
PAY_3    category
PAY_4    category
PAY_5    category
PAY_6    category
BILL_AMT1    float64
BILL_AMT2    float64
BILL_AMT3    float64
BILL_AMT4    float64
BILL_AMT5    float64
BILL_AMT6    float64
PAY_AMT1    float64
PAY_AMT2    float64
PAY_AMT3    float64
PAY_AMT4    float64
PAY_AMT5    float64
PAY_AMT6    float64
default    int64
Number of missed payments    int64
Average Bill Amount (TD)    float64
Is Average Bill Amount less than 10K?    int64
Is Average greater than 10k and less than 30k    int64
Is Average greater than 30k and less than 50k    int64
Is Average greater than 50k and less than 70k    int64
Is Average greater than 70k and less than 100k    int64
DUE_1    int64
DUE_2    int64
DUE_3    int64
DUE_4    int64
DUE_5    int64
DUE_6    int64
gender    category
Education_level    object
Marriage_status    object
dtype: object
# Number of clients by marital status, split by default status
fig = px.histogram(new_data, x="Marriage_status", color='default')
fig.update_xaxes(
    categoryorder="total descending"  # most frequent category first
)
fig.update_layout(
    title="Credit defaulters by marriage status of clients")
fig.show()
import plotly.express as px
# Share of the total credit granted, by education level
fig = px.pie(new_data, values='LIMIT_BAL', names='Education_level', color_discrete_sequence=px.colors.diverging.Spectral)
fig.update_layout(
    title="Given Credit distribution by Education level of Clients")
fig.show()
# Pair each client's missed-payment count with the default flag
ldata = pd.DataFrame({'Missed_payments': new_data['Number of missed payments']})
# Treat any negative count as no missed payment
ldata.loc[(ldata['Missed_payments'] < 0),'Missed_payments'] = 0
ldata.Missed_payments.unique()
ldata.Missed_payments = ldata['Missed_payments'].astype('category')
# Negative counts were set to 0 above, so only the codes 0 to 6 need a label
delay_status = {0:'Pay duly',1:'One_month',2:'Two_month',3:'Three_month',4:'Four_month',5:'Five_month',6:'Six_month'}
ldata['Missed_payments']= ldata['Missed_payments'].map(delay_status)
ldata['default'] = new_data['default']
ldata
ldata
| | Missed_payments | default |
|---|---|---|
| 0 | Two_month | 1 |
| 1 | Two_month | 1 |
| 2 | Pay duly | 0 |
| 3 | Pay duly | 0 |
| 4 | Pay duly | 0 |
| 5 | Pay duly | 0 |
| 6 | Pay duly | 0 |
| 7 | Pay duly | 0 |
| 8 | One_month | 0 |
| 9 | Pay duly | 0 |
| 10 | One_month | 0 |
| 11 | One_month | 0 |
| 12 | Pay duly | 0 |
| 13 | Four_month | 1 |
| 14 | Pay duly | 0 |
| 15 | Two_month | 0 |
| 16 | Four_month | 1 |
| 17 | Pay duly | 0 |
| 18 | One_month | 0 |
| 19 | One_month | 0 |
| 20 | Pay duly | 0 |
| 21 | Pay duly | 1 |
| 22 | Four_month | 1 |
| 23 | Pay duly | 1 |
| 24 | Pay duly | 0 |
| 25 | Pay duly | 0 |
| 26 | One_month | 1 |
| 27 | Pay duly | 0 |
| 28 | Pay duly | 0 |
| 29 | Pay duly | 0 |
| ... | ... | ... |
| 29970 | Pay duly | 0 |
| 29971 | Pay duly | 0 |
| 29972 | Pay duly | 0 |
| 29973 | One_month | 1 |
| 29974 | Four_month | 1 |
| 29975 | Pay duly | 0 |
| 29976 | Six_month | 1 |
| 29977 | Pay duly | 0 |
| 29978 | Pay duly | 0 |
| 29979 | Pay duly | 0 |
| 29980 | Pay duly | 0 |
| 29981 | Four_month | 0 |
| 29982 | Pay duly | 1 |
| 29983 | Pay duly | 0 |
| 29984 | Pay duly | 0 |
| 29985 | Pay duly | 0 |
| 29986 | Pay duly | 0 |
| 29987 | Pay duly | 0 |
| 29988 | Pay duly | 0 |
| 29989 | Pay duly | 0 |
| 29990 | Pay duly | 0 |
| 29991 | Six_month | 1 |
| 29992 | Pay duly | 0 |
| 29993 | Pay duly | 0 |
| 29994 | Six_month | 1 |
| 29995 | Pay duly | 0 |
| 29996 | Pay duly | 0 |
| 29997 | Three_month | 1 |
| 29998 | One_month | 1 |
| 29999 | Pay duly | 1 |
30000 rows × 2 columns
# Number of clients by count of missed payments, split by default status
fig = px.histogram(ldata, x="Missed_payments", color="default")
fig.update_xaxes(
    categoryorder="total descending"  # most frequent category first
)
fig.update_layout(
    title="Payment_delay")
fig.show()
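A count histogram makes the large "Pay duly" group dominate the picture, so the share of defaulters inside each category is hard to read off. As a complementary view, the sketch below uses `pd.crosstab` with `normalize="index"` on a small made-up frame (the rows are illustrative only) to turn counts into per-category rates.

```python
import pandas as pd

# Toy stand-in for ldata; the values are invented for illustration.
toy = pd.DataFrame({
    "Missed_payments": ["Pay duly", "Pay duly", "Pay duly",
                        "One_month", "One_month", "Two_month"],
    "default": [0, 0, 1, 0, 1, 1],
})

# normalize="index" divides each row by its row total, so every row
# holds the share of non-defaulters (0) and defaulters (1) in that category.
rates = pd.crosstab(toy["Missed_payments"], toy["default"], normalize="index")
print(rates)
```

Applied to the real `ldata`, the column labelled 1 would give the default rate for each missed-payment category.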
new_data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 30000 entries, 0 to 29999
Data columns (total 41 columns):
ID    30000 non-null int64
LIMIT_BAL    30000 non-null float64
SEX    30000 non-null int64
EDUCATION    30000 non-null category
MARRIAGE    30000 non-null category
AGE    30000 non-null int64
PAY_1    30000 non-null category
PAY_2    30000 non-null category
PAY_3    30000 non-null category
PAY_4    30000 non-null category
PAY_5    30000 non-null category
PAY_6    30000 non-null category
BILL_AMT1    30000 non-null float64
BILL_AMT2    30000 non-null float64
BILL_AMT3    30000 non-null float64
BILL_AMT4    30000 non-null float64
BILL_AMT5    30000 non-null float64
BILL_AMT6    30000 non-null float64
PAY_AMT1    30000 non-null float64
PAY_AMT2    30000 non-null float64
PAY_AMT3    30000 non-null float64
PAY_AMT4    30000 non-null float64
PAY_AMT5    30000 non-null float64
PAY_AMT6    30000 non-null float64
default    30000 non-null int64
Number of missed payments    30000 non-null int64
Average Bill Amount (TD)    30000 non-null float64
Is Average Bill Amount less than 10K?    30000 non-null int64
Is Average greater than 10k and less than 30k    30000 non-null int64
Is Average greater than 30k and less than 50k    30000 non-null int64
Is Average greater than 50k and less than 70k    30000 non-null int64
Is Average greater than 70k and less than 100k    30000 non-null int64
DUE_1    30000 non-null int64
DUE_2    30000 non-null int64
DUE_3    30000 non-null int64
DUE_4    30000 non-null int64
DUE_5    30000 non-null int64
DUE_6    30000 non-null int64
gender    30000 non-null category
Education_level    30000 non-null object
Marriage_status    30000 non-null object
dtypes: category(9), float64(14), int64(16), object(2)
memory usage: 7.6+ MB
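# Plot a heat map annotated with the pairwise correlation values
plt.figure(figsize=(40,30))  # size in inches; (400,300) would produce an impractically large image
cor = cdata.select_dtypes(include='number').corr()  # correlation is only defined for numeric columns
sns.set(font_scale=1.6)
sns.heatmap(cor, annot=True, annot_kws={"size": 15}, vmax=.9, square=True, cmap='RdYlGn')
plt.show()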
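With this many columns the heat map is dense, and the row of interest is usually the one for `default`. A minimal sketch of pulling out that row and ranking the features by absolute correlation follows; the toy frame and its column `missed` are assumptions made for illustration, not the notebook's data.

```python
import pandas as pd

# Toy numeric frame; the values are invented for illustration.
toy = pd.DataFrame({
    "LIMIT_BAL": [20000, 120000, 90000, 50000, 500000, 30000],
    "AGE": [24, 26, 34, 37, 29, 41],
    "missed": [2, 2, 0, 0, 0, 3],
    "default": [1, 1, 0, 0, 0, 1],
})

# Pearson correlation of every feature with the target, target itself removed.
corr = toy.corr()["default"].drop("default")

# Order by strength of association while keeping the original sign.
ranked = corr.reindex(corr.abs().sort_values(ascending=False).index)
print(ranked)
```

On the real data the same two lines could be run on `cor` from the cell above. Pearson correlation with a 0/1 target only captures linear association, so it is a rough screening aid rather than a feature-importance measure.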